Considering Functional Spreadsheet Operator Usage Suggests the Value of Example Driven Modelling for Decision Support Systems

نویسندگان

  • Simon R. Thorne
  • David Ball
چکیده

Most spreadsheet surveys both for reporting use and error focus on the practical application of the spreadsheet in a particular industry. Typically these studies will illustrate that a particular percentage of spreadsheets are used for optimisation and a further percentage are used for ‘What if’ analysis. Much less common is examining the classes of function, as defined by the vendor, used by modellers to build their spreadsheet models. This alternative analysis allows further insight into the programming nature of spreadsheets and may assist researchers in targeting particular structures in spreadsheet software for further investigation. Further, understanding the functional make-up of spreadsheets allows effective evaluation of novel approaches from a programming point of view. It allows greater insight into studies that report what spreadsheets are used for since it is explicit which functional structures are in use in spreadsheets. We conclude that a deeper understanding of the use of operators and the operator’s relationship to error would provide fresh insight into the spreadsheet error problem. Considering functional spreadsheet operator usage suggests the value of Example Driven Modelling for Decision Support Systems Electronic Spreadsheets and business Research has shown that the use of spreadsheets in organisations is pervasive, used in many industries and for many purposes. This pervasive use of spreadsheets is reflected in a gradual increase in spreadsheet error research; this is partly due to the slow realisation that spreadsheets, as end user tools, are error prone and that spreadsheets are used in strategic business applications Most End User Development (EUD) academic surveys such as (Davies 1987, Jenne 1996 and Taylor et al. 1998) rarely focus on spreadsheets. It is therefore difficult to precisely say who uses spreadsheets and for what purpose. However, there are some surveys that provide data on spreadsheet usage specifically. Pemberton and Robson (2000) surveyed 227 respondents from a mix of Private and Public sector organisations in the UK. They found that over 80% of respondents were spreadsheet users at some level. The activities spreadsheets were used for ranged from data storing to Decision Support Systems (DSS) implemented in spreadsheets. The Spreadsheet Engineering Research Programme (SERP) is an international research effort aimed at surveying spreadsheet usage across a number of large organizations. It is not currently complete but has already surveyed over 1300 participants in a number of organizations in the United States and the United Kingdom. SERP (2005) found that spreadsheets are used predominantly for alternative modeling (what if analysis) with 98% of respondents indicating so. This was closely followed by 90% using spreadsheets for data analysis. The most common technique used in these models is statistical analysis (49%) followed by optimization techniques (31%). Interestingly the results presented by SERP (2005) differ significantly to Pemberton and Robson (2000). The SERP data suggests that most spreadsheets are used for “what if” analysis using statistical techniques, conversely Pemberton and Robson (2000) found that statistical analysis was the least popular use for spreadsheets, the most popular being data sorting. Further, Brancheau and Wetherbe (1990) found that the adoption of spreadsheet applications by users started in 1981 with VisiCalc and grew significantly until 1987 when the survey was completed. In historical terms 1987 marked the introduction of GUIs and subsequently excel was introduced in the same year. Taylor et al. (1998) surveyed 34 UK organisations and found that 85% of those organisations used spreadsheets as their primary method for EUD. However, since this study was aimed at EUD and not spreadsheets, there is no further detail on type of use. Considering functional spreadsheet operator usage suggests the value of Example Driven Modelling for Decision Support Systems: Thorne & Ball Page 148 In addition, there are case studies and audit experiences written by both academics and practitioners. Such investigations are usually driven by some financial loss or realisation that the practice they are partaking in carries some significant risk. Two such cases are Fernandez (2002) and Gosling (2003) who both surveyed spreadsheet usage and policy in their respective organisations. Both found that spreadsheets are used as low-level data stores and for calculation of trivial items such as expenses. They also discovered that spreadsheets had become part of the IT infrastructure of the organisations and that the business would be seriously affected if these spreadsheet applications failed. This clearly shows a strategic reliance on spreadsheets for decision analysis. Croll (2005) reports on the usage of spreadsheets in the financial markets in London. Croll found that there is great reliance on spreadsheets for modelling as the below quote from one of the participant responses highlights. “Excel is utterly pervasive. Nothing large (good or bad) happens without it passing at some time through Excel” (Croll, 2005) The evidence provided so far in this section suggests spreadsheets as the most commonly used end user tool. There is also evidence to suggest that spreadsheets are used in a strategic manner (Croll 2005, Fernandez 2002, Gosling 2003). Croll’s evidence comes in the form of quotes from participants studied. Both Fernandez and Gosling are vertical case studies in two different organisations. Fernandez investigates an international private company and Gosling a large National Health Service (NHS) trust. The aim of both studies was to establish the use of spreadsheets in each organisation and examine the implications of their use. Both Fernandez and Gosling found that spreadsheets were pervasive in the organisations and that spreadsheets were used in a strategic manner. In some cases spreadsheets were used to make decisions on how an entire department was run. Spreadsheets had also become part of the Information Systems Architecture, removing data from the corporate system, manipulating it and then re entering it into the Corporate system. Clearly this practice of merging validated and un-validated data is undermines the integrity of corporate systems. Spreadsheet Errors The first study into spreadsheet error was conducted by Brown and Gould for IBM in 1987. This study took 9 experienced spreadsheet developers and examined their performance when asked to create a number of spreadsheets from scratch. They found that 44% of the spreadsheets developed contained errors such as mistyping formulae. This study was conducted because the authors had noted that business spreadsheet usage had boomed and that it had been suggested that spreadsheets might contain errors. Since this original paper, there have been many studies in spreadsheet error and the statistics reported from these studies varies from 30% to 100% of models with errors. Table 1 depicts some experimental studies with relevant error rates. Author and Year Percentage of models with errors Hicks and Panko, 1995 91% Javrin and Morrison, 1996 84% KPMG, 1997 91% Panko and Halverson, 1997 80% Javrin and Morrison, 2000 95% Table 1 Spreadsheet error rates These statistics have led other researchers to investigate spreadsheet error in more depth, asking a variety of pertinent questions: what causes errors, how many types are there, how can errors be reduced or removed. Thus far most spreadsheet error research studies have considered error from a general point of view, i.e. a focus on the number of models produced with error or the average number of errors per cell. Whilst this information is useful for estimating the extent of error in certain domains it does not necessarily tell us why error comes about or what causes the error. Research into taxonomies of error such as (Panko and Halverson 1998, Teo and Tan 1999 and Ayalew et al 2000, Rajalingham et al. Considering functional spreadsheet operator usage suggests the value of Example Driven Modelling for Decision Support Systems: Thorne & Ball Page 149 2000, Rajalingham 2005) have all examined and defined error types in varying levels of detail. These error types are not specific to the actual application, rather they are generic. Limitations of current spreadsheet research General surveys of spreadsheet error have traditionally focused on the end user products, i.e. the studies are themed according to the final application of the technology. They are often written from a management point of view, highlighting inadequacies in policy or practice and attributing these inadequacies to poor spreadsheet quality. Typical output would be the percentage of spreadsheets used in the accounting industry together with the percent used for optimisation. Whilst this serves a purpose, it does not shed any new light on the nature of spreadsheet error or what causes spreadsheets to be so error prone. An alternative view of reporting spreadsheet error is to examine the programming structures that spreadsheets are composed of. Programming structures in spreadsheets consist of formulae constructed utilising built in vendor operators. A deeper understanding of the use of operators and the operators relationship to error would provide fresh insight into the spreadsheet error problem. Within spreadsheet software, there are a number of vendor defined classes of function, each function contains various operators that relate to the class they are a member of. Functional Classes in Spreadsheet Software As defined by the vendor Microsoft, there are 11 classes of function offered with the standard Excel spreadsheet software. Excel is chosen since it is the most commonly used spreadsheet application according to Walchenbach (2005). Walchenbach states that Excel now accounts for 90% of the spreadsheet market, although it is difficult to determine the exact number of Excel users, in 1997 alone Microsoft shipped over 70 million copies of Excel 97. These classes contain operators to be used in formulae expressions and are grouped according to their actual purpose. The 11 classes contain varying amounts of operators ranging from 5 to 78 operators in a class, offering a grand total 343 unique operators. The 11 class groupings are shown in table 2. Class Name Number of operators Database 12 Date and Time 20 Financial 53 Engineering 39 Information 18 Logical 6 Look-up and Reference 17 Math and Trigonometry 60 Statistical 78 Text 35 External linking 5 Table 2 Excel function classes Studies of functional usage As previously mentioned, there have been very few documented studies of operator functionality usage. The studies that do exist offer some insight but often lack detail or are a minor aspect of a larger study. Chan and Storey (1996) surveyed 256 analysts using Lotus 123 on the functionality of spreadsheets used, see figure 1. The survey was based upon a Likert scale (1 being never and 5 always).The participants indicated how often they use a particular class of function in their spreadsheet and that was recorded on a Likert scale. For example if they never used the Goal Seek function, this would be recorded as a 1 on the Likert scale. The main findings of this study, see figure 1, show that mathematical and statistical functions are the most frequently used and that goal seeking is the least used. However, since this study was conducted on Lotus 123 users, the functional classes are different to that of Excel. Unfortunately the vendor Lotus were unable to provide a detailed functionality listing for Lotus 1-2-3. The difference Considering functional spreadsheet operator usage suggests the value of Example Driven Modelling for Decision Support Systems: Thorne & Ball Page 150 between Lotus and Excel makes direct comparison difficult, e.g. some operators in Excel are not supported in Lotus 1-2-3 and vice-versa. Figure 1 Chan and Storey (1996) Frequency of spreadsheet operator use Ballinger et al. (2003) presented spreadsheet functional data collected from 259 Excel workbooks used to record student marks in a University. Figure 2 shows the results of the survey, in this case the data shows how many operators of a function type were used, i.e. there were 751 logical operators in the 259 workbooks. Figure 2 (Ballinger et al., 2003) Frequency of spreadsheet operator use The data clearly shows that logical and math functions are used more extensively than other classes. Indeed, most classes were not used in the worksheets which suggest that the sample may be biased due to the specific application. The results of this study concur with that of Chan and Storey (1996) to some extent. Both studies identify that mathematical functions are used extensively. However, it is unclear if Chan and Storey (1996) include the Logic operators in their mathematics class. Through Private Communication with Barry Lawson of The Tuck management school Dartmouth College in the US, further data regarding functional utilisation was obtained. The data was extracted Spreadsheet function usage 1 1.5 2 2.5 3 3.5 4 4.5 5 0 2 4 6 8 10 Functions U pt ak e (1 N ev er , 5 A lw ay s) Math/statistical functions Macros Graphics Data import/export What-If Linking Financial functions Database functions Goal seeking Considering functional spreadsheet operator usage suggests the value of Example Driven Modelling for Decision Support Systems: Thorne & Ball Page 151 from a ‘base of knowledge’ gathered at the college via the Spreadsheet Engineering Research Project (SERP). The data was based upon 35 randomly selected spreadsheets that were submitted by the schools alumni. The results of the study are presented in figure 3. Figure 3 suggests there is a disproportionate amount of Financial and Statistical function usage, given the other studies. One possible reason for this apparent bias is that the data was extracted from the Tuck schools alumni who had all studied financial management in some sense on predominantly MBA programmes which traditionally contain financial management aspects. “I would only observe that because we are a business school, one might expect that our contacts may be biased in that direction hence perhaps a larger than random number of financial spreadsheets” Barry Lawson, SERP, Tuck School of Management (2006) Excel function usage by %

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

MULTI-ATTRIBUTE DECISION MAKING METHOD BASED ON BONFERRONI MEAN OPERATOR and possibility degree OF INTERVAL TYPE-2 TRAPEZOIDAL FUZZY SETS

This paper proposes a new approach based on Bonferroni mean operator and possibility degree to solve fuzzy multi-attribute decision making (FMADM) problems in which the attribute value takes the form of interval type-2 fuzzy numbers. We introduce the concepts of interval possibility mean value and present a new method for calculating the possibility degree of two interval trapezoidal type-2 fuz...

متن کامل

A robust aggregation operator for multi-criteria decision-making method with bipolar fuzzy soft environment

Molodtsov initiated soft set theory that provided a general mathematicalframework for handling with uncertainties in which we encounter the data by affix parameterized factor during the information analysis as differentiated to fuzzy as well as bipolar fuzzy set theory.The main object of this paper is to lay a foundation for providing a new application of bipolar fuzzy soft tool in ...

متن کامل

Power harmonic aggregation operator with trapezoidal intuitionistic fuzzy numbers for solving MAGDM problems

Trapezoidal intuitionistic fuzzy numbers (TrIFNs) express abundant and flexible information in a suitable manner and  are very useful to depict the decision information in the procedure of decision making. In this paper, some new aggregation operators, such as, trapezoidal intuitionistic fuzzy weighted power harmonic mean (TrIFWPHM) operator, trapezoidal intuitionistic fuzzy ordered weighted po...

متن کامل

Trapezoidal intuitionistic fuzzy prioritized aggregation operators and application to multi-attribute decision making

In some multi-attribute decision making (MADM) problems, various relationships among the decision attributes should be considered. This paper investigates the prioritization relationship of attributes in MADM with trapezoidal intuitionistic fuzzy numbers (TrIFNs). TrIFNs are a special intuitionistic fuzzy set on a real number set and have the better capability to model ill-known quantities. Fir...

متن کامل

A DSS generator for multiobjective optimisation of spreadsheet-based models

Water management practice has benefited from the development of model-driven Decision Support Systems (DSS), and in particular those that combine simulation with single or multiple-objective optimisation tools. However, there are many performance, acceptance and adoption problems with these decision support tools caused mainly by misunderstandings between the communities of system developers an...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/0803.0164  شماره 

صفحات  -

تاریخ انتشار 2006